1. INTRODUCTION

Cloud computing (CC) is a computing paradigm that delivers managed resources and services over the
internet. It offers convenient, shared, on-demand access to servers, storage, applications, services,
networks and so on, and these resources can be provisioned rapidly and efficiently with little effort [1].
A virtual machine (VM) distributes resources such as processing cores and the system bus among
applications; however, these resources are restricted by the available processing power [2]. Cloud
computing has many applications in both industry and academia, and workflows are designed to take
advantage of its computing resources. A workflow comprises a large number of dependent and independent
tasks and is widely used for scientific as well as business purposes; applications include bioinformatics,
medicine, weather forecasting, astronomy, and so forth [3], [4].

Figure 1 shows a simple workflow comprising 9 tasks; the scientific workflow is presented in the
performance evaluation section. A simple workflow is easy to schedule, whereas scientific workflows such as
Montage, Epigenomics, SIPHT and CyberShake, which are applied in research areas such as earthquake science
and astronomy, are considerably more complicated. Moreover, a scientific workflow involves complex data of
various sizes and requires higher processing power; by providing on-demand virtual resources, the cloud
computing paradigm helps in handling these workflows.

Figure 1. Simple workflow


Early work on load balancing focused on traditional approaches such as dynamic load balancing, round
robin, ant colony optimization and first come first serve (FCFS); the most commonly used approaches were
first in first out (FIFO) and weighted round robin (WRR). Simulation toolkits additionally support the
modelling of cloud computing environments that consist of inter-networked and single clouds, and they expose
customized interfaces for implementing load balancing and scheduling policies in the VMs as well as
provisioning mechanisms for VM allocation; see Abdullahi et al. [5] and Yaghmaee et al. [6] for more details.
Moreover, considering the importance of workflow scheduling, several studies have been carried out, which are
discussed extensively in the literature survey section; these use either a single approach or a hybrid
approach, and meta-heuristic mechanisms produce better results in comparison with other models. Methods such
as heterogeneous earliest finish time (HEFT) target heterogeneous environments and map each task to a
virtual machine in priority order, as explained in [7]-[9]. Other methods such as the gravitational search
algorithm (GSA) utilize the law of gravitation: the algorithm starts with a random set of particles, each
particle represents a solution, and the masses are computed using the fitness function, as explained in
[10]-[14].
Figure 2 shows the workflow of scheduling. Power consumption and simplicity are critical concerns when
handling the various challenges that arise from the cloud model [13], [14].

Figure 2. Workflow in cloud computing

Workflow scheduling is considered an NP-complete problem and has been widely researched for various
paradigms, including cluster computing and grid environments. In a cloud-based environment, scientific
workflows can be scheduled efficiently. However, for large values of m and n, obtaining the optimal solution
with existing exhaustive methods such as brute force is very expensive and time-consuming. Although
metaheuristic approaches have been proposed, they suffer from their own demerits. Motivated by the above
points, in this research work we design and develop a mechanism, optimized load balancing in parallel
computation (OLBP), which distributes the load efficiently in order to improve workflow scheduling. The
contributions of this research work are as follows. First, we carry out an extensive survey of existing
mechanisms for optimizing energy and makespan. Second, an optimized load-balancing scheme for parallel
computing is designed so that loads are distributed efficiently; OLBP performs task offloading based on
server speed. Moreover, execution speed and energy consumption are taken as constraints and are optimized
together with the load balancing approach. Finally, OLBP is evaluated on the CyberShake scientific workflow
and its variants; four distinct instances are created and compared with the existing protocol to demonstrate
the efficiency of the proposed mechanism.

The first section of this paper begins with the background of computation and the need for the cloud
computing paradigm; it then addresses the importance of scheduling mechanisms and concludes with the
motivation and contribution of the research work. The second section examines the various existing
protocols and their shortcomings, and the third section develops OLBP and its mathematical formulation. In
the fourth section, OLBP is evaluated on several instances.


2. LITERATURE SURVEY 

In the cloud computing paradigm, task scheduling aims to identify the optimal mapping between virtual
machines and tasks in accordance with the cloud model and the users' goals, such as low execution time and
optimal productivity. This can be achieved through load balancing and optimization mechanisms, as shown by
Yin et al. [15] and Gao et al. [16]. In general, single-objective optimization approaches were developed
first, namely Min-Min by Patel et al. [17] and Max-Min by Mao et al. [18] and Kong et al. [19]. An
improvement was presented by Zhang et al. [20], who developed a segmented Min-Min approach: the tasks were
first ordered by completion time, the resulting sequence was then segmented, and Min-Min was applied to each
segment. This algorithm performed better than Min-Min when task lengths varied dramatically, because longer
tasks were executed first. Etminani and Naghibzadeh [21] adopted a novel scheduling mechanism based on
Max-Min and Min-Min, in which the standard deviation (SD) of the results of both algorithms on the virtual
machines was computed. Devipriya and Ramesh [22] developed an improved Max-Min algorithm that schedules
tasks with large execution times on virtual machines with lower computing speed; this method successfully
minimized the task execution time. Although the above improved versions of Min-Min and Max-Min minimize the
task execution time and are easy to implement, they lead to unbalanced load, and long-term overload affects
the quality of service, which is critical in workflow scheduling; for more details refer to [23], [24].
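
To make the load-imbalance remark concrete, the following is a minimal Python sketch of the basic Min-Min heuristic for independent tasks. The function name `min_min`, the task lengths and the VM speeds are illustrative assumptions and are not taken from [17]-[22]; execution time is assumed to be task length divided by VM speed.

```python
def min_min(task_lengths, vm_speeds):
    """Basic Min-Min: repeatedly schedule the task with the smallest
    achievable completion time on the VM that achieves it.

    Returns (assignment, makespan), where assignment maps task index -> VM index.
    """
    ready = [0.0] * len(vm_speeds)            # time at which each VM becomes free
    unscheduled = set(range(len(task_lengths)))
    assignment = {}

    while unscheduled:
        best = None                            # (completion_time, task, vm)
        for task in unscheduled:
            for vm, speed in enumerate(vm_speeds):
                finish = ready[vm] + task_lengths[task] / speed
                candidate = (finish, task, vm)
                if best is None or candidate < best:
                    best = candidate
        finish, task, vm = best
        assignment[task] = vm
        ready[vm] = finish
        unscheduled.remove(task)

    return assignment, max(ready)


if __name__ == "__main__":
    # Hypothetical example: three tasks, one slow VM (speed 1) and one fast VM (speed 2).
    print(min_min([4.0, 2.0, 8.0], [1.0, 2.0]))
```

With these example values every task is placed on the faster VM while the slower VM stays idle, which is the kind of load imbalance discussed above.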

Multi-objective optimization approaches were created to address these drawbacks. The researchers in
[25]-[27] created an LB-ACO-based algorithm that uses the ant colony optimization (ACO) mechanism to obtain
the best results; non-dominated sorting is then used to produce a Pareto solution set that reflects the
tradeoff between load balancing and makespan time in a cloud computing environment. Furthermore, a
task-scheduling mechanism based on the genetic ACO mechanism was created in [28]. As described by Cui and
Zhang [29], a genetic algorithm (GA) based task scheduling approach was developed using a two-dimensional
coding mechanism; genetic mutation and crossover operations were designed to produce new offspring and
improve the population diversity. Li et al. [30] and Xiao [31] integrated three mechanisms, namely
artificial bee colony (ABC), GA and ACO; a sharing module was developed to share the optimal solutions
found by the three algorithms, from which the solution space was created. The integration of these three
models improved the convergence accuracy. Wu and Wang [32] developed an estimation of distribution algorithm
(EDA) approach for solving the parallel scheduling problem with priority constraints. Further, in the
literature survey we observed that most methods ignore the load balancing mechanism and focus on a single
objective such as task execution time or power optimization, as mentioned by Aziza and Krichen [33] and
Long et al. [34].


3. METHOD 

In this section, we design and develop an optimized load balancing mechanism in parallel computation;
optimized load balancing distributes the load efficiently such that the various constraints related to the
workflow are fulfilled. Task scheduling and resource provisioning are the two challenges that load balancing
poses in any workflow. In addition, the key goals of cloud service providers are to deliver dependable
services at an affordable price with improved resource usage, availability, and energy efficiency. For
independent tasks, the order of execution is not important; for dependent tasks, however, the execution
order matters, because a task can only run after the tasks it depends on have been executed, which makes
scheduling more complicated. The workflow is represented by a directed acyclic graph (DAG), in which a node
indicates a task and an edge indicates the relation between two nodes. Figure 3 shows the load balancing
mechanism, which comprises various parts: the users, the data center controller (DCC), the load balancer,
where load balancing is carried out through the proposed approach, the virtual machine (VM) manager, which
manages the VMs, the VM monitor, which monitors VM activity, and the physical servers. In this work we focus
on designing the load balancing part.

Figure 3. General load balancing
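
The execution-order constraint for dependent tasks can be illustrated with a short sketch. This is a minimal Python example, assuming the workflow is given as a mapping from each task to the tasks it depends on; the function name `execution_levels` and the four-task example are hypothetical and do not correspond to Figure 1 or to the CyberShake workflow.

```python
def execution_levels(predecessors):
    """Group the tasks of a DAG into levels that can run in parallel.

    predecessors maps every task to an iterable of the tasks it depends on
    (every task must appear as a key). Tasks in the same level have no
    dependencies on each other; each level only depends on earlier levels.
    """
    remaining = {task: set(deps) for task, deps in predecessors.items()}
    levels = []
    while remaining:
        # Tasks whose dependencies have all been scheduled are ready to run.
        ready = sorted(task for task, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("cycle detected: the workflow is not a DAG")
        levels.append(ready)
        for task in ready:
            del remaining[task]
        for deps in remaining.values():
            deps.difference_update(ready)
    return levels


if __name__ == "__main__":
    workflow = {"t1": [], "t2": ["t1"], "t3": ["t1"], "t4": ["t2", "t3"]}
    print(execution_levels(workflow))
```

Independent tasks end up in the same level and can be distributed freely by the load balancer, whereas the levels themselves must be processed in order.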


3.1. Optimized load balancing in parallel computation-OLBP steps 
In this section, we go through the steps of the optimized load balancing algorithm. The steps of the OLBP
algorithm are:
Step 1: Initiate.
Step 2: Create the analytical model for parallel and heterogeneous computing.
Step 3: Formulate the power utilization.
Step 4: Divide the designed analytical model into two constraints.
Step 5: Use the fixed arrival rate of the tasks to calculate the service rate.
Step 6: Calculate the number of waiting tasks, the task response time, the number of busy servers, the
server utilization, and the server speed.
Step 7: Design the load balancing through server speed, and then carry out task offloading considering
makespan and power consumption.


3.2. System model 

To design the load balancing mechanism for workflows, a heterogeneous server model is designed. Let us
consider $m$ heterogeneous server systems $R_1, R_2, R_3, \ldots, R_m$ with speeds $r_1, r_2, r_3, \ldots, r_m$
and sizes $l_1, l_2, l_3, \ldots, l_m$; that is, $R_h$ has $l_h$ identical servers of speed $r_h$. Each
$R_h$ is treated as a separate queueing model. Let us consider tasks arriving at $R_h$ with arrival rate
$A_h$, where the task inter-arrival times are independent and identically distributed. The load distribution
mechanism is divided into several sub-mechanisms. The task execution time on $R_h$ is $q/r_h$, where $q$ is
the task execution requirement; let $\varsigma_h = r_h/q$ be the service rate, i.e., the average number of
tasks finished by a server of $R_h$ in a given unit of time. The server utilization $X_h$ can then be
computed as in (1); it can be interpreted as the average fraction of time during which a server is busy. Let
$O_{h,j}$ be the probability that there are $j$ tasks in the queue of $R_h$, as given in (2).

Further, the probability of queueing in $R_h$ is given as (4).

$$O_{p,h} = O_{h,0}\,\frac{(l_h X_h)^{l_h}}{l_h!}\cdot\frac{1}{1 - X_h} \qquad (4)$$

The average number of tasks in $R_h$ is computed through (5).

$$L_h = \sum_{j=0}^{\infty} j\,O_{h,j} = l_h X_h + \frac{X_h}{1 - X_h}\,O_{p,h} \qquad (5)$$

The average task response time in the model is given as:

$$S_h = \frac{L_h}{A_h} = \frac{1}{\varsigma_h}\left(1 + \frac{O_{p,h}}{l_h\,(1 - X_h)}\right)$$
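
As a numerical illustration of the quantities in this section, the sketch below evaluates them under the assumption that each server system behaves as a standard M/M/m queue (Poisson arrivals, exponentially distributed service, identical servers). That assumption, the function name `queue_metrics` and the example values are made for illustration only; the formulas are the textbook M/M/m results and are not claimed to reproduce equations (1)-(3) of this paper exactly.

```python
from math import factorial


def queue_metrics(arrival_rate, size, speed, requirement):
    """Steady-state metrics of one server system modelled as an M/M/m queue.

    arrival_rate : task arrival rate of the system (must be > 0)
    size         : number of identical servers in the system
    speed        : speed of each server
    requirement  : mean task execution requirement
    """
    service_rate = speed / requirement                  # tasks per unit time per server
    utilization = arrival_rate / (size * service_rate)
    if arrival_rate <= 0 or utilization >= 1:
        raise ValueError("need 0 < arrival_rate < size * speed / requirement")

    offered = size * utilization                        # offered load
    tail = offered ** size / (factorial(size) * (1.0 - utilization))
    p_empty = 1.0 / (sum(offered ** k / factorial(k) for k in range(size)) + tail)
    p_queue = p_empty * tail                            # probability that a task has to wait
    mean_tasks = offered + utilization / (1.0 - utilization) * p_queue
    return {
        "utilization": utilization,
        "p_empty": p_empty,
        "p_queue": p_queue,
        "mean_tasks": mean_tasks,
        "response_time": mean_tasks / arrival_rate,     # Little's law
    }


if __name__ == "__main__":
    # Hypothetical system: 2 servers of speed 2, unit task requirement, arrival rate 2.
    print(queue_metrics(arrival_rate=2.0, size=2, speed=2.0, requirement=1.0))
```

The response time is obtained from the mean number of tasks through Little's law, which is the same relation used between the average number of tasks and the average response time in this section.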
3.3. Power consumption

Power dissipation and circuit delay in digital complementary metal-oxide-semiconductor (CMOS) circuits
can be modelled using a straightforward relation. Power consumption is very important here, and the basic
dynamic power can be expressed as (8).

$$O = aBU^{2}e \qquad (8)$$

In (8), the loading capacitance is denoted by $B$, the activity factor by $a$, the supply voltage by $U$,
and the clock frequency by $e$. Because the system in this case is heterogeneous, the servers of $R_h$ are
considered to have speed $r_h$. Two separate scenarios are taken into consideration. In the first scenario,
a server runs at zero speed when there is no task to be completed, and the power consumed can be given as
(9).

$$O_h = l_h O_h^{*} + A_h\,q\,\vartheta_h\,r_h^{\alpha_h - 1} \qquad (9)$$

Here $\vartheta_h r_h^{\alpha_h}$ is the power consumption of a busy server of $R_h$ at speed $r_h$, $l_h$ is
the number of servers, and $O_h^{*}$ is the base power consumption of each server. In the second scenario,
the server continues to run at the specified speed $r_h$ even when there are no tasks remaining to complete,
so the power consumption can be calculated using (10).

$$O_h = l_h\left(O_h^{*} + \vartheta_h\,r_h^{\alpha_h}\right) \qquad (10)$$
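
The two scenarios can be compared numerically with the short sketch below. It assumes the commonly used model in which a busy server running at speed r draws a dynamic power of `coeff * r ** exponent` on top of a constant base power; the parameter names `coeff`, `exponent` and `base_power` and all example values are illustrative assumptions, not values from the experiments.

```python
def idle_speed_power(arrival_rate, size, speed, requirement, coeff, exponent, base_power):
    """Average power when servers drop to zero speed while they have no task.

    Dynamic power is only drawn for the fraction of time a server is busy
    (the utilization), while the base power is always drawn.
    """
    utilization = arrival_rate * requirement / (size * speed)
    return size * (utilization * coeff * speed ** exponent + base_power)


def constant_speed_power(size, speed, coeff, exponent, base_power):
    """Average power when servers keep running at full speed even when idle."""
    return size * (coeff * speed ** exponent + base_power)


if __name__ == "__main__":
    # Hypothetical system: 2 servers of speed 2, arrival rate 1, unit task requirement.
    idle = idle_speed_power(1.0, 2, 2.0, 1.0, coeff=1.0, exponent=3.0, base_power=0.5)
    const = constant_speed_power(2, 2.0, coeff=1.0, exponent=3.0, base_power=0.5)
    print(idle, const)
```

Because the utilization of a stable system is below one, the idle-speed scenario never consumes more than the constant-speed scenario under this model.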


3.4. Optimal load balancing in parallel computing 

Let $r_{h,j}$ be the speed of a server of $R_h$ (whose nominal speed is $r_h$) when $j$ tasks are queued
in the system, where $j$ is not zero. Furthermore, we consider the speed scheme
$(r_{h,1}, r_{h,2}, r_{h,3}, \ldots)$ of $R_h$; such a speed scheme directly represents and reflects the
dependency of power and speed on the workload. Three distinct scenarios are considered. If
$r_{h,1} = r_{h,2} = r_{h,3} = \cdots = r_h$, a uniform speed scheme is used. If $r_{h,0} = 0$, the server is
said to be in idle-speed mode; if $r_{h,0} = r_h$, it is said to be in constant-speed mode. Additionally, the
arrival rate (AR) and the service rate (SR) in the various states $j$ are processed for load balancing
depending on speed and power, where $j$ refers to the number of tasks within the server model,

$$\varsigma_{h,j} = \begin{cases} j\,r_{h,j}/q, & 1 \le j \le l_h - 1 \\ l_h\,r_{h,j}/q, & j \ge l_h \end{cases}$$

$$O_{h,j} = O_{h,0}\,A_h^{\,j}\left(\varsigma_{h,1}\,\varsigma_{h,2} \cdots \varsigma_{h,j}\right)^{-1} \qquad (12)$$

where $O_{h,0}$ can be formulated through (13).
The average number of tasks in $R_h$ is given as (14).

$$M_h = \sum_{j=1}^{\infty} j\,O_{h,j} \qquad (14)$$

This provides the response time (RT) of a task,

$$S_h = M_h\,(A_h)^{-1} \qquad (15)$$

the average number of busy servers in $R_h$ is then calculated and is given as (16).

$$\Lambda_h = \sum_{j=1}^{l_h - 1} j\,O_{h,j} + \sum_{j=l_h}^{\infty} l_h\,O_{h,j} \qquad (16)$$

Consequently, the above is used to calculate the server utilization,

$$X_h = \Lambda_h\,(l_h)^{-1} \qquad (17)$$

Th = Ukto Onjs (18) 


The average energy consumption is calculated last, and it is formulated for the two scenarios that were
described earlier in this section.

Further, we compute the server speed; here the optimized server speed is formulated. Consider the speed
scheme of $R_h$, represented through $\Psi_h = (a_{h,1}, \ldots, a_{h,c_h - 1};\ r_{h,1}, \ldots, r_{h,c_h})$.
For a given server size $l_h$, the execution speed can then be formulated as (21).

Further, closed-form solutions are derived for various quantities such as average power, average task
response time, server utilization and average execution speed; these can be formulated and optimized
through (22).

$$O_h = \sum_{j=1}^{\infty} j\,O_{h,j} \qquad (22)$$

Using the above, the task response time is formulated as (23).

$$S_h = O_h / A_h \qquad (23)$$

Further, considering these parameters, we design the workload strategy for load balancing.
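
The workload-dependent quantities of this section can be evaluated numerically as sketched below. The sketch assumes a birth-death queue in which the service rate in state j is min(j, size) times the state-dependent speed divided by the task requirement, and it truncates the infinite sums at a finite number of states. The function name `dynamic_speed_metrics` and the callable `speed_of(j)` that describes the speed scheme are hypothetical names introduced for illustration.

```python
import math


def dynamic_speed_metrics(arrival_rate, size, requirement, speed_of, max_states=2000):
    """Truncated steady-state metrics of a queue with workload-dependent speed.

    speed_of(j) returns the server speed used while j >= 1 tasks are in the system.
    """
    weights = [1.0]                                   # unnormalised probability of state 0
    for j in range(1, max_states + 1):
        service_rate = min(j, size) * speed_of(j) / requirement
        weights.append(weights[-1] * arrival_rate / service_rate)

    total = sum(weights)
    if not math.isfinite(total) or weights[-1] > 1e-12 * total:
        raise ValueError("queue is unstable or max_states is too small")

    probs = [w / total for w in weights]
    mean_tasks = sum(j * p for j, p in enumerate(probs))
    busy_servers = sum(min(j, size) * p for j, p in enumerate(probs))
    return {
        "mean_tasks": mean_tasks,
        "response_time": mean_tasks / arrival_rate,   # Little's law
        "busy_servers": busy_servers,
        "utilization": busy_servers / size,
    }


if __name__ == "__main__":
    # Hypothetical scheme: speed 1.5 while at most 2 tasks are present, speed 2.5 otherwise.
    scheme = lambda j: 1.5 if j <= 2 else 2.5
    print(dynamic_speed_metrics(arrival_rate=2.0, size=2, requirement=1.0, speed_of=scheme))
```

When `speed_of` returns a constant, the sketch reduces to the fixed-speed multi-server queue, which gives a simple consistency check.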


3.5. Workload designing strategy 

In this section, we design and develop the workload balancing strategy, which optimizes two
sub-problems, namely makespan time and energy. Both are considered sub-problems associated with load
balancing. Further, in this section optimized task offloading is designed to solve the makespan time and
energy problems.


3.5.1. Makespan problem 

In this section, we focus on the load balancing mechanism with makespan time minimization. Considering
the makespan time, a distribution of load must be found, which is expressed through the task arrival rates
assigned to the server systems of the model. Thus $E(A_1, A_2, \ldots, A_m) = A$, where

$$E(A_1, A_2, \ldots, A_m) = A_1 + A_2 + \cdots + A_m, \qquad X_h < 1, \quad \forall\, 1 \le h \le m \qquad (24)$$

The load balance is achieved through dynamic server speed. Given $m$ server systems with sizes
$l_1, \ldots, l_m$ and speed schemes
$\Psi_h = (a_{h,1}, a_{h,2}, \ldots, a_{h,c_h - 1};\ r_{h,1}, r_{h,2}, \ldots, r_{h,c_h})$ of $R_h$, power
consumption $O_1^{*}, O_2^{*}, \ldots, O_m^{*}$, task execution requirement $q$ and arrival rate $A$, we need
to balance the load; hence the load distribution $A_1, A_2, \ldots, A_m$ that satisfies (24) is found and
optimized as follows.

$$E(A_1, A_2, \ldots, A_m) = A_1 + A_2 + \cdots + A_m \qquad (25)$$

$$\nabla S(A_1, A_2, \ldots, A_m) = \Phi\,\nabla E(A_1, A_2, \ldots, A_m) \qquad (26)$$

$$\frac{\partial S}{\partial A_h} = \Phi\,\frac{\partial E}{\partial A_h} = \Phi$$

$\Phi$ is a multiplier. Given this multiplier, we find $A_1, A_2, \ldots, A_m$, whose sum is used for
verifying the condition $E(A_1, A_2, \ldots, A_m) = A$; this in turn generates the value of $\Phi$.


3.5.2. Energy optimization 

Let us consider $m$ server systems with sizes $l_1, l_2, l_3, \ldots, l_m$ and speed schemes
$\Psi_h = (a_{h,1}, a_{h,2}, \ldots, a_{h,c_h - 1};\ r_{h,1}, r_{h,2}, \ldots, r_{h,c_h})$ of $R_h$,
$\forall\, 1 \le h \le m$. Considering the energy parameters $O_1^{*}, O_2^{*}, \ldots, O_m^{*}$, the task
execution requirement $q$ and the arrival rate $A$, a load balance must be achieved such that the energy
$P(A_1, A_2, \ldots, A_m)$ is minimized subject to the constraint given in (28).

$$E(A_1, A_2, \ldots, A_m) = A_1 + A_2 + \cdots + A_m, \qquad X_h < 1, \quad \forall\, 1 \le h \le m \qquad (28)$$

Further, the above problem is minimized through a Lagrange multiplier and denoted as (29).

$$\nabla P(A_1, A_2, \ldots, A_m) = \Phi\,\nabla E(A_1, A_2, \ldots, A_m) \qquad (29)$$

$$\frac{\partial P}{\partial A_h} = \Phi\,\frac{\partial E}{\partial A_h} = \Phi$$

Moreover, it is observed that $\partial P / \partial A_h$ is a complex function of $A_h$, which makes finding
an analytical solution highly improbable; hence the problem is solved through the following steps. (a) Since
$\partial P / \partial A_h$ is an increasing function of $A_h$, for a given $\Phi$ the bisection approach is
used to find the value of $A_h$. (b) The calculated sum $A_1 + A_2 + \cdots + A_m$ is used for verifying the
condition $E(A_1, A_2, \ldots, A_m) = A$. (c) This verification is employed to find $\Phi$ through the
bisection method. (d) The optimized equation is given as (30).


Be a 4 was 
0A, A (0, = a) (30) 
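
The nested bisection in steps (a) to (c) can be sketched as follows. This is a minimal Python illustration, not the authors' implementation: the function names `solve_increasing`, `distribute_load` and `mm1_marginal` are hypothetical, and the marginal cost used in the example is that of a single-server M/M/1 queue, chosen only because it is increasing in the load and has a simple closed form. The marginal function of the actual model in (30) would be substituted in its place.

```python
def solve_increasing(g, target, lo, hi, iters=100):
    """Inner bisection: x in [lo, hi] with g(x) close to target, g increasing.

    Returns lo if even the smallest load already costs at least target,
    and converges to hi if the target lies beyond g(hi).
    """
    if g(lo) >= target:
        return lo
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def distribute_load(total_rate, marginals, capacities, iters=100):
    """Outer bisection on the multiplier so that the loads sum to total_rate.

    marginals[h](a) is the increasing marginal cost of system h at load a,
    capacities[h] is the largest load system h may receive.
    """
    if not 0 < total_rate < sum(capacities):
        raise ValueError("total_rate must be positive and below the total capacity")

    def loads_for(phi):
        return [solve_increasing(g, phi, 0.0, cap, iters)
                for g, cap in zip(marginals, capacities)]

    phi_lo, phi_hi = 0.0, 1.0
    while sum(loads_for(phi_hi)) < total_rate:        # find an upper bracket for phi
        phi_lo, phi_hi = phi_hi, 2.0 * phi_hi
    for _ in range(iters):
        phi = 0.5 * (phi_lo + phi_hi)
        if sum(loads_for(phi)) < total_rate:
            phi_lo = phi
        else:
            phi_hi = phi
    return loads_for(phi_hi), phi_hi


def mm1_marginal(service_rate):
    """Derivative of a / (service_rate - a): assumed example cost, increasing in a."""
    return lambda a: service_rate / (service_rate - a) ** 2


if __name__ == "__main__":
    rates = [4.0, 1.0]                                # hypothetical service rates
    loads, phi = distribute_load(3.0,
                                 [mm1_marginal(r) for r in rates],
                                 [0.999999 * r for r in rates])
    print(loads, phi)
```

The capacities are set slightly below the service rates so that the marginal cost stays finite, which mirrors the requirement that the utilization of every system remains below one.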


4. PERFORMANCE EVALUATION 

Cloud computing resources have become one of the most successful real-time computing models in active
use. They are accessible, affordable, and can be used from anywhere in the world over the internet. They may
be used for a variety of purposes, one of which is workflow scheduling for scientific workflows, in which
the tasks are both interdependent and independent and which has a range of benefits and drawbacks. However,
the excessive energy consumption of these computing devices can impair their performance, and makespan is a
significant constraint. We therefore introduced OLBP for heterogeneous computing systems to reduce energy
consumption efficiently while also delivering superior performance. The proposed model runs on Windows 10
64-bit with 16 GB of RAM and an Intel Core i5-4460 processor running at 3.20 GHz. The project is developed
and run in Eclipse Neon, CloudSim is used as the simulator, and the code is written in Java [34]. We test
the effectiveness of the proposed model using the CyberShake scientific workflow in four different
instances, comparing total time and power consumption. Additionally, a graphic representation of our
findings that takes into account task count, execution time, and energy consumption is provided. We consider
scientific workflows with 30, 50, 100, and 1000 tasks.


4.1. Energy consumption 

Energy is one of the important parameters in any optimization problem. Figure 4 shows the energy
consumption comparison of the existing and proposed methods. For the first instance, the existing model
requires 3495.42 whereas the proposed model requires 1303.740369. For the second and third instances, the
existing model requires 8518.395166 and 18966.33731 whereas OLBP requires 1330.921945 and 1436.836915
respectively. Similarly, for the fourth instance the existing model requires 236303.2878 and the proposed
OLBP model requires 3228.604586. Another parameter, average power, is also considered to evaluate the model:
the less power required, the more efficient the model. Figure 5 shows the average power comparison, where it
is observed that the existing model requires 19.14, 20.11, 20.30, and 20.11 whereas the proposed model
requires 15.81, 15.90, 15.90, and 15.90 for the four instances respectively.

Figure 4. Energy consumption

4.2. Makespan time

For the first, second, third and fourth instances, the existing model takes makespan times of 6359.41,
14448.9, 30124.41, and 74543, whereas OLBP requires makespan times of 2938.22, 2953.86, 3116.22, and 4794.39
respectively. Figure 6 displays the power sum comparison. Figure 7 shows the makespan time comparison.

Figure 6. Power sum

Figure 7. Makespan


4.3. Comparative analysis 

In this section, we compare the effectiveness of the existing and proposed methods on the CyberShake
workflow while taking into account several parameters, including energy consumption, power sum, average
power, and makespan time. In the above sections, we observed that, in terms of energy, the proposed model
consumes 62% less for instance 1, 84.37% less for instance 2, 92.42% less for instance 3 and 98.63% less for
instance 4. Similarly, considering average power, the proposed model is 17.40%, 20.96%, 21.70%, and 20.96%
more efficient than the existing mechanism for instance 1, instance 2, instance 3, and instance 4
respectively. Furthermore, considering the power sum as the parameter, the proposed model consumes 61.83%,
83.41%, 91.90%, and 94.91% less power sum than the existing protocol for instance 1 to instance 4
respectively. Finally, makespan time is compared, and it is observed that the proposed model is 53.79% more
efficient for instance 1, 79.55% for instance 2, 89.65% for instance 3 and 93.56% for instance 4 than the
existing model. Furthermore, throughout the comparison it is observed that, for each parameter, as the
workflow variant changes and the number of tasks increases, performance degrades; however, the proposed
model either remains constant or degrades comparatively less.
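
The energy and makespan percentages above can be recomputed from the values reported in sections 4.1 and 4.2 as relative reductions, (existing - proposed) / existing. The short Python check below does this; the helper name `percent_reduction` is illustrative, and the results may differ from the quoted figures in the last digit because of rounding.

```python
def percent_reduction(existing, proposed):
    """Relative reduction of the proposed value with respect to the existing one, in percent."""
    return 100.0 * (existing - proposed) / existing


# Values for instances 1 to 4 as reported in sections 4.1 and 4.2.
energy_existing = [3495.42, 8518.395166, 18966.33731, 236303.2878]
energy_olbp = [1303.740369, 1330.921945, 1436.836915, 3228.604586]
makespan_existing = [6359.41, 14448.9, 30124.41, 74543.0]
makespan_olbp = [2938.22, 2953.86, 3116.22, 4794.39]

energy_reduction = [percent_reduction(e, p) for e, p in zip(energy_existing, energy_olbp)]
makespan_reduction = [percent_reduction(e, p) for e, p in zip(makespan_existing, makespan_olbp)]

if __name__ == "__main__":
    for i, (e, m) in enumerate(zip(energy_reduction, makespan_reduction), start=1):
        print(f"instance {i}: energy -{e:.2f}%, makespan -{m:.2f}%")
```

The power sum and average power percentages are not recomputed here because the underlying power sum values are not listed in the text.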


5. CONCLUSION 

Workflow management systems are considered important in the construction of various applications such
as e-commerce and scientific applications, through the automation of inter-enterprise and intra-enterprise
business processes. Moreover, the dynamic business environment and the increase in global competition have
led researchers to design scalable and flexible business environments, which can be achieved through load
balancing. Further, one of the barriers to green cities, and a pressing issue that must be resolved, is the
high energy consumption and increased makespan time in cloud data centers. To date, many load-balancing
algorithms have been created in an effort to cut down on energy use and process execution time. This
research work addresses the issue by developing OLBP, which schedules and distributes the workload in such a
way that important parameters like power, energy, and makespan time are optimized. To evaluate OLBP, four
distinct instances are designed; each instance comprises workflow variants with a random number of virtual
machines. The comparative analysis indicates that OLBP outperforms the existing model in terms of average
power, energy consumption, and makespan time. Although OLBP outperforms the existing protocol, many aspects
remain to be investigated; in future we will focus on optimizing other performance parameters such as fault
tolerance, reliability, and cost.